We report a possible solution to the problem that covariance fitting fails when the data are highly
correlated and the covariance matrix has small eigenvalues. As an example, we choose the data
analysis of highly correlated $B_K$ data on the basis of SU(2) staggered chiral perturbation
theory. The essence of the problem is that we do not have a sufficiently accurate fitting function, so
that we cannot fit the highly correlated and precise data. When some eigenvalues of the covariance
matrix are small, even a tiny error in the fitting function can produce a large chi-square and spoil the
fitting procedure. We have applied a number of existing prescriptions, such as the
diagonal approximation and the cutoff method. In addition, we present a new method, the eigenmode
shift method, which fine-tunes the fitting function while keeping the covariance matrix untouched.
1. Introduction

We have reported results for $B_K$ calculated using improved staggered fermions with $N_f = 2+1$
flavors in Ref. [§]. In that work, we use the diagonal approximation (uncorrelated fitting) instead of
the full covariance fitting, because the $\chi^2$ value of the full covariance fit was out of range, which
indicates that the full covariance fitting fails manifestly. One of the most frequently asked questions about
Ref. [JXJ] is why we do the uncorrelated fitting instead of the full covariance fitting.

Here, we provide an elaborate answer to why we use the diagonal approximation. In addition, 
we propose a new method, named the eigenmode shift (ES) method, which fine-tunes the fitting 
function while keeping the covariance matrix untouched. More details on this issue will be reported
in Ref. [§].



2. Covariance fitting 

First, we review the covariance fitting. Then, we would like to address the possible failure of 
the covariance fitting, which originates from the truncation error of the fitting function in the series 
expansion of the staggered chiral perturbation theory (SChPT). 

Let us consider $N$ samples of unbiased estimates of a quantity $y_i$ with $i = 1, 2, 3, \ldots, D$. Here,
the data set is $\{y_i(n) \mid n = 1, 2, 3, \ldots, N\}$. Let us assume that the samples $y_i(n)$ are statistically
independent in $n$ for fixed $i$ but are substantially correlated in $i$. An introduction to this subject is
given in Ref. @, ||@]. 

We are interested in the probability distribution of the average of the data $y_i(n)$, defined by
$\bar{y}_i = \frac{1}{N} \sum_{n=1}^{N} y_i(n)$. We assume that the measured values of $\bar{y}_i$ have a normal
distribution $P(\bar{y})$ by the central limit theorem for the multivariate statistical analysis as follows:

$$P(\bar{y}) = \frac{1}{Z} \exp\left[ -\frac{1}{2} \sum_{i,j} (\bar{y}_i - \mu_i) \left[ N \Gamma^{-1} \right]_{ij} (\bar{y}_j - \mu_j) \right] \qquad (2.1)$$

where $\mu_i$ represents the true mean value of $\bar{y}_i$, which is, in general, unknown and can be obtained
as $N \to \infty$, and $Z$ is the normalization constant. Here, $\Gamma_{ij}$ is the true covariance matrix, which is, in
general, unknown in our problems. The maximum likelihood estimator of $\frac{1}{N}\Gamma_{ij}$ turns out to be the
sample covariance matrix of the mean, $C_{ij}$, defined as follows,

$$C_{ij} = \frac{1}{N(N-1)} \sum_{n=1}^{N} [y_i(n) - \bar{y}_i][y_j(n) - \bar{y}_j]. \qquad (2.2)$$
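As a minimal sketch of Eq. (2.2), assuming the samples are stored as an $N \times D$ NumPy array, the average and the covariance matrix of the mean could be computed as follows. The function name and the random toy data are illustrative only and are not part of the analysis.

```python
import numpy as np

def mean_and_cov_of_mean(samples):
    """samples: array of shape (N, D); row n holds y_i(n) for i = 1..D."""
    samples = np.asarray(samples, dtype=float)
    n_samples = samples.shape[0]
    y_bar = samples.mean(axis=0)
    dev = samples - y_bar
    # Eq. (2.2): sum over samples of outer products, divided by N(N-1).
    cov_of_mean = dev.T @ dev / (n_samples * (n_samples - 1))
    return y_bar, cov_of_mean

# Toy data (hypothetical): N = 200 samples of D = 4 quantities.
rng = np.random.default_rng(0)
toy = rng.normal(size=(200, 4))
y_bar, c = mean_and_cov_of_mean(toy)
```

The result agrees with `np.cov(toy, rowvar=False) / N`, since `np.cov` uses the $1/(N-1)$ normalization by default.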

Let us consider a fitting function, $f_\text{th}(X_i; c_a)$. Here, $X_i$ are the input variables which define the data
points and $c_a$ are the fitting parameters. What we want to do is to determine the fitting parameters
that give the best fit and to test whether the fitting function describes the data reliably from the
standpoint of statistics. Here, the best fit is defined by minimizing $T^2$, where $T^2$ is

$$T^2 = \sum_{i,j} [\bar{y}_i - f_\text{th}(X_i)] \, [C^{-1}]_{ij} \, [\bar{y}_j - f_\text{th}(X_j)]. \qquad (2.3)$$
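The quadratic form of Eq. (2.3) can be sketched in a few lines. The helper name `t_squared` and the two-point example are assumptions made for illustration; the linear solve is used only to avoid forming the inverse matrix explicitly.

```python
import numpy as np

def t_squared(y_bar, f_values, cov_of_mean):
    """T^2 = r^T C^{-1} r with r = y_bar - f_th, as in Eq. (2.3)."""
    r = np.asarray(y_bar, dtype=float) - np.asarray(f_values, dtype=float)
    # Solve C z = r instead of computing C^{-1} explicitly.
    return float(r @ np.linalg.solve(cov_of_mean, r))

# Hypothetical two-point example with an uncorrelated unit covariance:
# T^2 reduces to the plain sum of squared residuals, 1^2 + 2^2.
example = t_squared([1.0, 2.0], [0.0, 0.0], np.eye(2))
```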



In the ideal case, the best fit gives the true mean of the data, $\mu_i$, in Eq. (2.1). We notice that
$\sqrt{N}[\bar{y}_i - f_\text{th}(X_i)]$ is distributed according to the multivariate normal distribution, $\mathcal{N}(\rho, \Gamma)$, where
$\rho_i = \sqrt{N}[\mu_i - f_\text{th}(X_i)]$. In this case, $[T^2/(N-1)][(N-d)/d]$ is distributed as the noncentral $F$
distribution of $F_{d,N-d}$, which is defined in Ref. with noncentrality parameter $\kappa$, defined by
$\kappa = \sum_{i,j} \rho_i \Gamma^{-1}_{ij} \rho_j$. Here, $d$ is the degrees of freedom of the fitting. In Ref. []|], it is proved that the
limiting distribution of $T^2$ as $N \to \infty$ is the $\chi^2$-distribution with $d$ degrees of freedom if $f_\text{th}(X_i) = \mu_i$.



The multivariate statistical theory predicts the following [f7|]:

$$\mathcal{E}(T^2) = (d + \kappa) \left[ 1 + \frac{d+1}{N} + \mathcal{O}\!\left(\frac{1}{N^2}\right) \right] \qquad (2.4)$$

$$\mathcal{V}(T^2) = 2(d + 2\kappa) \left[ 1 + \frac{1}{N} \left( 2d + 4 + \frac{(d+\kappa)^2}{d + 2\kappa} \right) + \mathcal{O}\!\left(\frac{1}{N^2}\right) \right] \qquad (2.5)$$

where $\mathcal{E}(T^2)$ and $\mathcal{V}(T^2)$ represent the expectation value and variance of $T^2$, defined in Eq. (2.3).
Here, $d$ is the degrees of freedom and $\kappa$ is the noncentrality parameter. If the fitting function is
exact (which means $f_\text{th}(X_i) = \mu_i$), the noncentrality parameter is zero. In that case, if we have a large
enough number of data samples to ignore the $\mathcal{O}(1/N)$ terms, we expect that $T^2$ has a value
around the degrees of freedom, $T^2 = d \pm \sqrt{2d}$.
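The following sketch compares a large-$N$ form of the expectation value and variance of $T^2$ with values obtained from the standard mean and variance of the noncentral $F$ distribution, using the statement that $[T^2/(N-1)][(N-d)/d]$ follows $F_{d,N-d}$. The large-$N$ expressions coded here, $(d+\kappa)[1+(d+1)/N]$ and $2(d+2\kappa)[1+(2d+4+(d+\kappa)^2/(d+2\kappa))/N]$, are an assumption about the form of Eqs. (2.4) and (2.5); the function names and the input values are illustrative.

```python
def t2_moments_large_n(d, kappa, n_samples):
    """Mean and variance of T^2 up to O(1/N) (assumed form of Eqs. 2.4-2.5)."""
    mean = (d + kappa) * (1.0 + (d + 1.0) / n_samples)
    var = 2.0 * (d + 2.0 * kappa) * (
        1.0 + (2.0 * d + 4.0 + (d + kappa) ** 2 / (d + 2.0 * kappa)) / n_samples
    )
    return mean, var

def t2_moments_from_noncentral_f(d, kappa, n_samples):
    """Same moments from the noncentral F(d, N-d) mean and variance."""
    nu2 = n_samples - d
    scale = (n_samples - 1.0) * d / nu2          # T^2 = scale * F
    f_mean = nu2 * (d + kappa) / (d * (nu2 - 2.0))
    f_var = (
        2.0 * (nu2 / d) ** 2
        * ((d + kappa) ** 2 + (d + 2.0 * kappa) * (nu2 - 2.0))
        / ((nu2 - 2.0) ** 2 * (nu2 - 4.0))
    )
    return scale * f_mean, scale ** 2 * f_var

# Exact fitting function (kappa = 0) with d = 1 and a hypothetical N = 10000.
mean_approx, var_approx = t2_moments_large_n(1, 0.0, 10000)
mean_exact, var_exact = t2_moments_from_noncentral_f(1, 0.0, 10000)
```

For these inputs both routes give a mean close to $d = 1$ and a variance close to $2d = 2$, which is the $T^2 = d \pm \sqrt{2d}$ expectation for an exact fitting function.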

2.1 Inexact fitting function 

One caveat is that the covariance fitting works only if the fitting function is precise enough.
In practice, we determine the fitting function based on SChPT, and it is given as a series in
$\mathcal{O}(p^{2n})$. Since we can include only a finite number of terms in the series, we usually truncate the
series at a certain order. As a consequence, the fitting function has a truncation error, which
makes it inexact at high precision. This usually does not cause much trouble. However, if the
covariance matrix has a very small eigenvalue, $\lambda_i$, the truncation error can be amplified by a factor
of $1/\lambda_i$, and then, sometimes, causes failure of the covariance fitting.



To see this, let us rewrite Eq. (2.3) using the eigenmode decomposition:

$$[C^{-1}]_{ij} = \sum_{k=1}^{D} \frac{1}{\lambda_k} |v_k\rangle_i \langle v_k|_j, \qquad T^2 = \sum_{k=1}^{D} \frac{1}{\lambda_k} \left| \langle \bar{y} - f_\text{th} | v_k \rangle \right|^2 \qquad (2.6)$$

where $\lambda_k$ and $|v_k\rangle$ are eigenvalues and eigenvectors of the covariance matrix $C_{ij}$, respectively. Here,
the average data points and the fitting function values are also written in bra-ket vector notation,
$|\bar{y}\rangle_i = \bar{y}_i$ and $|f_\text{th}\rangle_i = f_\text{th}(X_i)$. If an eigenvalue $\lambda_i$ is very small, $T^2$ is dominated by the
corresponding eigenmode. The fitting procedure works very hard to minimize the difference between
the average data points and the fitting function values, $|\bar{y}\rangle - |f_\text{th}\rangle$, in the $|v_i\rangle$ direction. If the fitting
function has an error in the $|v_i\rangle$ direction, the fitting procedure endeavors to fit in the wrong direction, losing
precision in other directions. Even if the error of the fitting function is small, the lost precision in
other directions can yield a significant error in the fitting result. Section 2.2 exemplifies this situation.
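To make the eigenmode picture concrete, the sketch below evaluates $T^2$ mode by mode for a toy $2 \times 2$ covariance matrix. All numbers are hypothetical and chosen only so that the two points are strongly correlated; the function name is likewise an assumption.

```python
import numpy as np

def t_squared_by_eigenmodes(y_bar, f_values, cov_of_mean):
    """Return T^2, its per-eigenmode contributions, and the eigenvalues."""
    lam, vecs = np.linalg.eigh(cov_of_mean)   # ascending eigenvalues, |v_k> in columns
    proj = vecs.T @ (y_bar - f_values)        # <v_k | y_bar - f_th>
    contributions = proj ** 2 / lam           # each mode is weighted by 1/lambda_k
    return float(contributions.sum()), contributions, lam

# Toy covariance with correlation 0.999 (hypothetical numbers).
cov = 1e-6 * np.array([[1.0, 0.999], [0.999, 1.0]])
y_bar = np.array([0.5000, 0.5100])
f_values = np.array([0.5001, 0.5099])         # misses each point by only 1e-4
total, contributions, lam = t_squared_by_eigenmodes(y_bar, f_values, cov)
```

In this toy case the residual lies along the eigenvector with the small eigenvalue ($10^{-9}$), so a mismatch of $10^{-4}$ per point, which is a tenth of the individual error $10^{-3}$, already gives $T^2 \approx 20$.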



If we have a large number of samples, Eq. (2.4) and Eq. (2.5) can be approximated by

$$T^2 = d + \kappa \pm \sqrt{2(d + 2\kappa)}, \qquad (2.7)$$

where $d$ is the degrees of freedom of the fitting and $\kappa$ is the noncentrality parameter. Using the
eigenmode decomposition, $\kappa$ can be written as

$$\kappa = \sum_{k=1}^{D} \frac{1}{\lambda_k} \left| \langle \mu - f_\text{th} | v_k \rangle \right|^2 \qquad (2.8)$$

where $\mu_i$ are the true means of $\bar{y}_i$.¹ Therefore, the error of the fitting function, $(\mu - f_\text{th})$, increases the
minimized value of $T^2$. Even if the error is small, tiny eigenvalues amplify $\kappa$.

2.2 Trouble with covariance fitting for Bk 

To demonstrate the problem, we choose the $B_K$ data on the C3 (coarse) ensemble of Ref. [jl]].
This ensemble is a particularly good sample, because it has relatively large statistics. It contains
671 configurations and we measured 9 times for each configuration. Details are given in Ref. 

The fitting functional form suggested by SU(2) staggered chiral perturbation theory (SChPT)
is linear in the fitting parameters as follows:

$$f_\text{th}(X) = \sum_{a=1}^{P} c_a F_a(X), \qquad (2.9)$$

where $c_a$ are the low energy constants (LECs) and $F_a$ are functions of $X$, which represents
collectively $X_P$ (pion squared mass of light valence (anti-)quarks), $Y_P$ (pion squared mass of strange
valence (anti-)quarks), and so on. The details on $F_a$ and $X$ are given in Ref. [g]|. Here, we focus on
the X-fit of the 4X3Y-NNLO fitting of SU(2) SChPT, which is explained in great detail in Ref. [@].
Since we have only 4 data points, we truncate the higher order terms in the fitting function and
keep three LECs, so $P = 3$. The neglected highest order term in $f_\text{th}(X)$ is $X^2 (\ln X)^2 \approx 0.006$,
where $X = X_P/\Lambda^2 \approx 0.02$. Hence, the fitting function has an error of that order.
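The size of the neglected term quoted above can be checked with a one-line evaluation, taking $X = 0.02$ as given in the text (the variable names are illustrative):

```python
import math

x = 0.02                                # X = X_P / Lambda^2, value quoted in the text
neglected = x ** 2 * math.log(x) ** 2   # X^2 (ln X)^2
print(f"{neglected:.4f}")
```

This evaluates to roughly 0.006, consistent with the estimate used for the truncation error.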

In the X-fit, we fix $am_y = 0.05$ and select 4 data points with $am_x = 0.005, 0.010, 0.015, 0.020$
to fit to the functional form suggested by SU(2) SChPT as in Ref. [jj]]. Hence, the covariance
matrix $C_{ij}$ is a $4 \times 4$ matrix. Its eigenvalues are

$$\lambda_i = \{ 1.95 \times 10^{-5},\ 1.92 \times 10^{-6},\ 7.58 \times 10^{-8},\ 1.11 \times 10^{-9} \}. \qquad (2.10)$$

Due to the high correlation of the data, the smallest eigenvalue is smaller than the largest eigenvalue
by four orders of magnitude. Let us look into the eigenvectors,
[Eq. (2.11): eigenvectors $|v_1\rangle, \ldots, |v_4\rangle$ of $C_{ij}$; legible components: 0.542, 0.0725, $-0.790$, 0.200, 0.546, 0.639, 0.503] (2.11)



The eigenvector $|v_4\rangle$ corresponds to the smallest eigenvalue, and it dominates the fitting completely.
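As a rough illustration of why the smallest eigenvalue dominates, the sketch below takes the eigenvalues of Eq. (2.10) and asks how much each mode would add to the noncentrality parameter if the fitting function were off by the same amount along every eigenvector. The mismatch $\delta = 10^{-4}$ is a hypothetical number chosen for illustration, not a value from the analysis.

```python
# Eigenvalues of the 4 x 4 covariance matrix quoted in Eq. (2.10).
eigenvalues = [1.95e-5, 1.92e-6, 7.58e-8, 1.11e-9]

# Spread between the largest and the smallest eigenvalue.
ratio = max(eigenvalues) / min(eigenvalues)

# Hypothetical overlap <mu - f_th | v_k> = delta for every mode:
# each mode then contributes delta^2 / lambda_k.
delta = 1.0e-4
contributions = [delta ** 2 / lam for lam in eigenvalues]

for k, value in enumerate(contributions, start=1):
    print(f"mode {k}: {value:.3g}")
```

Under this assumption the first three modes contribute well below one unit each, while the fourth contributes about 9, so the same small mismatch matters only in the $|v_4\rangle$ direction.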



In Fig. 1(a), we show the fitting results with the full covariance matrix. As one can see, the
fitting curve does not pass through the data points. The $T^2$ value is 7.2 with 1 degree of freedom,
which indicates that the fitting fails manifestly. Let us perform the eigenmode decomposition on
$|\bar{y}\rangle$ and $|f_\text{th}\rangle$ as follows:

$$|\bar{y}\rangle = \sum_{i=1}^{4} a_i |v_i\rangle, \qquad |f_\text{th}\rangle = \sum_{i=1}^{4} b_i |v_i\rangle \qquad (2.12)$$

where $a_i$ and $b_i$ are the eigenmode projection coefficients. As we can see in Table 1, the difference
between $a_i$ and $b_i$ is $1.75\sigma$ for $|v_1\rangle$ and $1.7\sigma$ for $|v_2\rangle$, whereas it is only $0.33\sigma$ for $|v_4\rangle$. Hence, the procedure of the



¹Here, we assume that we have a large enough number of data samples so that the $\lambda_k$ and $|v_k\rangle$ of the sample covariance
matrix $C_{ij}$ fairly represent those of the true covariance matrix.
Figure 1: $B_K(1/a)$ vs. $X_P$ on the C3 ensemble. The fit type is 4X3Y-NNLO in the SU(2) analysis. The
red line in the left figure represents the results of fitting with the full covariance matrix. The red line in the right
figure represents the results of the uncorrelated fitting using the diagonal approximation. The red
diamond corresponds to the $B_K$ value obtained by extrapolating $m_x$ to the physical light valence quark mass
after setting all the pion multiplet splittings to zero.

covariance fitting works hard for the coefficient of $|v_4\rangle$ but works less precisely for the coefficients
of $|v_1\rangle$ and $|v_2\rangle$, mainly because the eigenvalue $\lambda_4$ is significantly smaller than $\lambda_1$ and $\lambda_2$. The irony
is that the average data points, $|\bar{y}\rangle$, have only 0.015% overlap with $|v_4\rangle$, while more than 99% of them
are dominated by $|v_1\rangle$ and $|v_2\rangle$. As a result, the fitting function misses the average data points. In
this sense, the failure of the full covariance fitting is due to the fact that the covariance
fitting tries to determine the coefficient of $|v_4\rangle$ very precisely, while losing precision in the $|v_1\rangle$ and
$|v_2\rangle$ directions. If the fitting function is exact, this procedure should yield a fitting result that describes
the data reasonably. However, if the fitting function has an error in the $|v_4\rangle$ direction, this failure
can happen.

Table 1: Eigenmode decomposition of $|\bar{y}\rangle$ and $|f_\text{th}\rangle$ for the full covariance fitting.

$i$      1          2            3           4
$a_i$    1.021(4)   0.5655(14)   0.1061(3)   0.01442(3)
$b_i$    1.014(4)   0.5679(11)   0.1058(3)   0.01443(3)
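The numbers quoted in the text can be reproduced from Table 1 with simple arithmetic. The sketch assumes that the deviation is measured in units of the quoted error of $a_i$ and that the overlap of $|\bar{y}\rangle$ with mode $i$ is $a_i^2 / \sum_j a_j^2$; both conventions are assumptions made here for illustration.

```python
# Central values and errors of a_i, and central values of b_i, from Table 1.
a = [1.021, 0.5655, 0.1061, 0.01442]
a_err = [0.004, 0.0014, 0.0003, 0.00003]
b = [1.014, 0.5679, 0.1058, 0.01443]

# Deviation |a_i - b_i| in units of the error of a_i (assumed convention).
deviation = [abs(ai - bi) / ei for ai, bi, ei in zip(a, b, a_err)]

# Fraction of |y_bar|^2 carried by each eigenmode (assumed definition of overlap).
norm2 = sum(ai ** 2 for ai in a)
overlap = [ai ** 2 / norm2 for ai in a]

for i, (dev, frac) in enumerate(zip(deviation, overlap), start=1):
    print(f"mode {i}: deviation = {dev:.2f} sigma, overlap = {100 * frac:.4f} %")
```

With these conventions the deviations for modes 1, 2 and 4 come out near 1.75, 1.7 and 0.33, the overlap with mode 4 is about 0.015%, and modes 1 and 2 together carry more than 99%.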



3. Prescriptions for the trouble 

If the covariance matrix has small eigenvalues, even a small error in the fitting function may yield
a large error in the fitting result. To circumvent this problem, we need some approximation methods,
such as the diagonal approximation or the cutoff method. In subsection 3.1, we propose a new method,
which we call the eigenmode shift (ES) method.






Highly correlated data fitting 



Boram Yoon 



[Figure 2 panels: (a) Cutoff method; (b) Eigenmode shift method. Horizontal axis: $X_P$ (GeV$^2$).]

Figure 2: $B_K(1/a)$ vs. $X_P$ on the C3 ensemble. The left figure shows the result of the cutoff method and the
right figure shows the result of the eigenmode shift method.



One simple solution to the problem is to use the diagonal approximation (uncorrelated fitting).
In this method, we neglect the off-diagonal covariance as follows: $C_{ij} = 0$ if $i \neq j$. In this way, the
small eigenvalue problem disappears. The fitting results are shown in Fig. 1(b).

Another possible solution is to exclude the eigenmodes corresponding to the small eigenvalues
from the inverse covariance matrix, $C^{-1}_{ij}$. In our example, $|v_4\rangle$ is removed by setting $1/\lambda_4 = 0$ in
Eq. (2.6). We call this the cutoff method. A number of lattice QCD groups [Q, ^] use this method
under the popular name of the SVD (singular value decomposition) method. In Fig. 2(a), we show the
results of the covariance fitting using the cutoff method.
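Both prescriptions amount to replacing the inverse covariance matrix in $T^2$. The sketch below shows one possible way to build the two replacements; the function names and the toy $2 \times 2$ covariance are assumptions for illustration and do not correspond to the $B_K$ data.

```python
import numpy as np

def inverse_diagonal(cov):
    """Diagonal approximation: drop off-diagonal covariances, then invert."""
    return np.diag(1.0 / np.diag(cov))

def inverse_with_cutoff(cov, n_drop):
    """Cutoff method: set 1/lambda_k = 0 for the n_drop smallest eigenvalues."""
    lam, vecs = np.linalg.eigh(cov)   # ascending eigenvalues, eigenvectors in columns
    inv_lam = 1.0 / lam
    inv_lam[:n_drop] = 0.0
    return (vecs * inv_lam) @ vecs.T  # sum_k inv_lam[k] |v_k><v_k|

# Toy example (hypothetical): residual along the small-eigenvalue direction.
cov = 1e-6 * np.array([[1.0, 0.999], [0.999, 1.0]])
r = np.array([-1e-4, 1e-4])
t2_full = float(r @ np.linalg.solve(cov, r))
t2_diag = float(r @ inverse_diagonal(cov) @ r)
t2_cut = float(r @ inverse_with_cutoff(cov, 1) @ r)
```

In this toy case the full covariance gives $T^2 \approx 20$, the diagonal approximation gives 0.02, and the cutoff with one mode removed gives essentially zero, because the residual lies entirely in the discarded direction.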



3.1 Eigenmode shift method 

We know that the whole trouble comes from the error of the fitting function in the $|v_4\rangle$ direction.
Hence, we can think of a new fitting function $f'_\text{th}$ defined as follows:

$$f'_\text{th}(X) = f_\text{th}(X) + \eta |v_4\rangle. \qquad (3.1)$$

Here, $\eta$ is a tiny parameter that can be determined by the Bayesian method. Hence, we modify the
$\chi^2$ as follows,

$$\chi^2_\text{aug} = \chi^2 + \frac{(\eta - a_\eta)^2}{\sigma_\eta^2}. \qquad (3.2)$$

We know that $\eta$ is very tiny, so we choose $a_\eta = 0$. As mentioned in section 2.2, the order of the
neglected highest order term in $f_\text{th}(X)$ is 0.006. Hence, we set $\sigma_\eta = 0.006$. Then we can do
the full covariance fitting with an extra fitting parameter, $\eta$. When we do the extrapolation to the
physical pion mass, we use only the $f_\text{th}(X)$ function, dropping the $\eta$ term. We call this the
eigenmode shift (ES) method.
This is the same as the following procedure: first, find a shifting vector, $\eta |v_4\rangle$, which minimizes
$\chi^2_\text{aug}$. Then fit with the tuned (shifted) fitting function. To account for the statistical error of $\eta$,
repeat this procedure over jackknife or bootstrap samples.
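For a fitting function that is linear in its parameters, the minimization of the augmented $\chi^2$ can be written as a single linear solve. The sketch below is one possible implementation under that assumption, taking the augmented form to be $r^T C^{-1} r + (\eta - a_\eta)^2/\sigma_\eta^2$ with $r = \bar{y} - F c - \eta\, v$, where $v$ is the eigenvector of the smallest eigenvalue. The function name, the design matrix, and the three-point toy data are hypothetical and are not the $B_K$ analysis. Note that the sign of $\eta$ depends on the sign convention of the eigenvector returned by the eigensolver.

```python
import numpy as np

def eigenmode_shift_fit(design, y_bar, cov, sigma_eta, a_eta=0.0):
    """Minimize r^T C^{-1} r + (eta - a_eta)^2 / sigma_eta^2 over (c, eta)."""
    lam, vecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    v_small = vecs[:, 0]                     # mode with the smallest eigenvalue
    g = np.column_stack([design, v_small])   # extra column carries eta
    w = np.linalg.inv(cov)
    prior = np.zeros(g.shape[1])
    prior[-1] = 1.0 / sigma_eta ** 2         # Gaussian prior acts on eta only
    lhs = g.T @ w @ g + np.diag(prior)       # normal equations with prior term
    rhs = g.T @ w @ y_bar + prior * a_eta
    theta = np.linalg.solve(lhs, rhs)
    r = y_bar - g @ theta
    chi2_aug = float(r @ w @ r + (theta[-1] - a_eta) ** 2 / sigma_eta ** 2)
    return theta[:-1], float(theta[-1]), chi2_aug

# Toy data (hypothetical): a straight line plus a small shift along v_small.
x = np.array([0.1, 0.2, 0.3])
design = np.column_stack([np.ones_like(x), x])
cov = 1e-6 * np.array([[1.0, 0.9, 0.8], [0.9, 1.0, 0.9], [0.8, 0.9, 1.0]])
c_true = np.array([0.5, 0.1])
eta_true = 1e-3
v_small = np.linalg.eigh(cov)[1][:, 0]
y_bar = design @ c_true + eta_true * v_small
c_fit, eta_fit, chi2_aug = eigenmode_shift_fit(design, y_bar, cov, sigma_eta=0.006)
```

Only `c_fit` would enter an extrapolation with the unshifted function; `eta_fit` is discarded afterwards. Repeating the call on each jackknife or bootstrap sample of `y_bar` would give the spread of `eta_fit`.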

In our example, the fitted value is $\eta = -0.00082(31)$, which is smaller by an order of magnitude
than the truncated highest order terms in $f_\text{th}$. In Fig. 2(b), we show the fitting results obtained using the
ES method. This method tunes the fitting function by a tiny amount so that it minimizes the small
eigenvalue contribution. In this sense, it looks similar to the cutoff method. However, unlike the
cutoff method, the ES method determines the shifting parameter, $\eta$, using the Bayesian method,
and the full covariance matrix remains untouched.

4. Conclusion 

Here, we address an issue of covariance fitting on the highly correlated $B_K$ data. It turns out
that a small error in the fitting function can make the fitting fail if the covariance matrix has small
eigenvalues. In order to get around the trouble, we have used approximations: the diagonal
approximation and the cutoff method. Here, we propose a new method, the eigenmode shift method,
which fine-tunes the fitting function while keeping the covariance matrix untouched.